Association between IRF6, TP63, GREM1 Gene Polymorphisms and Non-Syndromic Orofacial Cleft Phenotypes in Vietnamese Population: A Case–Control and Family-Based Study

This study aims to identify potential variants in the TP63–IRF6 pathway and GREM1 for the etiology of non-syndromic orofacial cleft (NSOFC) among the Vietnamese population. By collecting 527 case–parent trios and 527 control samples, we conducted a stratified analysis based on different NSOFC phenotypes, using allelic, dominant, recessive and over-dominant models for case–control analyses, and family-based association tests for case–parent trios. Haplotype and linkage disequilibrium analyses were also conducted. IRF6 rs2235375 showed a significant association with an increased risk for non-syndromic cleft lip and palate (NSCLP) and cleft lip with or without cleft palate (NSCL/P) in the G allele, with pallele values of 0.0018 and 0.0003, respectively. Due to the recessive model (p = 0.0011) for the NSCL/P group, the reduced frequency of the GG genotype of rs2235375 was associated with a protective effect against NSCL/P. Additionally, offspring who inherited the G allele at rs2235375 had a 1.34-fold increased risk of NSCL/P compared to the C allele holders. IRF6 rs846810 and a G-G haplotype at rs2235375–rs846810 of IRF6 impacted NSCL/P, with p-values of 0.0015 and 0.0003, respectively. In conclusion, our study provided additional evidence for the association of IRF6 rs2235375 with NSCLP and NSCL/P. We also identified IRF6 rs846810 as a novel marker associated with NSCL/P, and haplotypes G-G and C-A at rs2235375–rs846810 of IRF6 associated with NSOFC.


Introduction
To date, approximately 1 in 700 children worldwide [1], and 1.4 out of 1000 live births in Vietnam [2], are born with orofacial clefts (OFCs), and many of them experience a range of functional, psychological and aesthetic problems, including feeding, hearing and speech difficulties [1].Depending on the region, ethnicity and socioeconomic status, the Genes 2023, 14, 1995 2 of 18 prevalence of OFC differs and it is observed more frequently among males than females.The majority of OFCs are non-syndromic orofacial cleft (NSOFC), while approximately 30-50% are syndromic [3].The most popular classification divides NSOFC into four major forms, including cleft lip and palate (NSCLP), cleft lip only (NSCLO), cleft palate only (NSCPO) and atypical facial cleft.Besides this, NSCLP and NSCLO are also combined in a unique group called cleft lip with or without cleft palate (NSCL/P) due to the resemblance of both the epidemiologic features and embryologic timing [4].The identification of the etiology of birth defects, including NSOFC, has been explored for centuries and the possible contribution of genetic and environmental risk factors has been a subject of debate.In essence, genes, endogenous and exogenous environmental factors, and the interplay between multiple genes and the environment, are all widely accepted to play a role through different mechanisms [4].
Recently, there has been significant statistical and biological support for several susceptibility loci of NSOFC.Through genome-wide association studies (GWAS), researchers have successfully identified common and rare variants, and their interactions with environmental factors [5][6][7].Of these loci, 1q32, which includes Interferon Regulatory Factor 6 (IRF6) that belongs to a family of transcription factors, plays a crucial role in the balance between immature ectoderm differentiation and proliferation during craniofacial morphogenesis.Kondo et al. [8] cloned human IRF6, which is well-documented as a causal gene for Van der Woude syndrome (VWS) and Popliteal Pterygium Syndrome, two disorders that can include isolated cleft lip (CLO) and isolated cleft palate (CPO) anomalies.Variants in recognized genes now explain approximately 20% of the pathogenesis of NSOFC, of which variants in IRF6 account for 12% [9].In the last few decades, GWAS have identified at least 43 genes/loci associated with NSOFC, and IRF6 is one of the most related genes among different populations [5,10].
The linkage between p63 and IRF6 was established with IRF6 being the direct target of TP63, which is located on chromosome 3q28 and encodes the tumor protein p63, during palatal development [11].The p63 protein, another transcription factor, plays an essential role in regulating ectodermal cell differentiation and embryonic development, including the development of the epidermis, limbs and craniofacial tissues [12].It has been proven that p63 has a role in activating IRF6 transcription via the IRF6 enhancer element called MCS9.7 [13]; a mutation within this enhancer element increases the chances of NSCL/P [14].Some studies recently reported that PBX proteins, transcription factors in a homeobox protein family, and SHH, Sonic Hedgehog signaling protein, are essential for embryonic development and bind to midfacial regulatory elements to regulate the expression of wingless-type MMTV integration site (WNT) signaling at the lambdoidal junction (where the maxillary, medial nasal and lateral nasal processes fuse) [15,16].WNT signaling is effective in most tissues during craniofacial development and required for body axis patterning, cell fate specification, cell proliferation [17,18] and palatogenesis [19].In turn, WNT signaling of Wnt9b-Wnt3 controls p63, which belongs to a feedback loop with Irf6, the mouse ortholog of IRF6 [15,16].Another factor that affects IRF6 through binding to the MCS9.7 enhancer on the binding site in the IRF6 is the AP2A transcription factor, whose gene TFAP2A was reported to be responsible for the cleft etiology in our previous study [20].
Gremlin 1 (GREM1) was initially described as an association factor of NSOFC risk due to displaying suggestive genome-wide significance in GWAS [5].GREM1 encoded a protein in the bone morphogenetic protein (BMP) antagonist family and mediated the regulation of the dynamic interactions of various epithelial and mesenchymal signaling centers.GREM1 binds to BMP2, BMP4 and BMP7, all of which are expressed in the developing palatal shelves and prevent engagement [21].GREM1 can bind to a BMP4 precursor intracellularly, preventing BMP4 secretion and, consequently, creating GREM1-BMP4 interactions.GREM1 is also essential for early limb outgrowth and patterning because of its role in the SHH-GREM1-FGF feedback loop [22].FGF, fibroblast growth factor, plays a role in cellular proliferation, migration and differentiation, mitogenesis, angiogenesis, embryogenesis and wound healing [23].The FGF signaling pathway regulates multiple developmental processes, including palatogenesis [24].
While several pieces of evidence support the contribution of IRF6 and its related genes to the etiology of NSOFC, deeper insights into the various subtypes of NSOFC should be obtained.In the present study, to assess whether a statistical association exists between the disease trait and the selected markers, we evaluated the hypothesis that genetic variants located in and around the IRF6, TP63 and GREM1 genes are associated with pathological NSOFC phenotypes in the Vietnamese population by using case-parent trios and case-control analysis.

Sample Study
This study was based on combining 2 complementary approaches: case-parent trio (a case and its parents) study and case-control study.All the affected participants were carefully screened for the absence of associated anomalies or syndromes by the maxillofacial surgeon and diagnosed with NSOFC based on clinical examination, treatment and medical records at the Odonto and Maxillofacial Hospital of Ho Chi Minh City (HCMC) in Vietnam from 2013 to 2019.A total of 527 Vietnamese patient-unaffected parent trios were included in this study.Of these, 101 trios had an index case with NSCPO, 172 trios with NSCLO and 254 trios with NSCLP.Besides these, 527 ethnically and region-matched healthy controls without a cleft history in their families were recruited for a case-control design.All subjects self-identified as Vietnamese, providing their name, gender and age and identifying as the Kinh people.All cases were selected strictly according several inclusion criteria, including (1) congenital NSOFC presence; (2) no other congenital malformations or acquired diseases of other systemic organs; (3) no family history of other genetic diseases; (4) both parents of the patient were mentally healthy and willing to complete the questionnaire.Inclusion criteria for the unrelated control group were as follows: (1) the absence of any congenital diseases or malformations of the systemic organs; (2) no family history of NSOFC or other genetic diseases for at least three generations; and (3) sound mental health and a willingness to participate in the questionnaire.People who did not meet the inclusion criteria were ruled out from the study, both in case and control samples.
The estimated sample size, power and effect size calculation were calculated by requirement in the case of a 1:1 ratio to achieve 80% statistical power under the assumption of a 5% α level for the case-control and case-parent designs.
Peripheral blood samples were collected on dried blood cards and stored at the World Cleft Gene Bank in Aichi Gakuin University, Nagoya, Japan.In both the case and control samples, informed consent for participation was collected from all participants, either from the participants themselves or from their parents in the case of children under the age of 18.All procedures were performed in accordance with the ethical guidelines laid down by the Declaration of Helsinki (World Medical Association 2013), and the study was approved by the Aichi Gakuin University Ethics Committee (Number 78).

Single-Nucleotide Polymorphism Selection and Genotyping
Single-nucleotide polymorphism (SNP) markers of interest were searched against the db-SNP database (http://www.ncbi.nlm.nih.gov/projects/SNP/(accessed on 13 January 2022)) and the 1000 Genomes Browser (http://browser.1000genomes.org/index.html(accessed on 13 January 2022)) to determine whether they had previously been reported and whether they had been described as pathogenic or nonpathogenic.In total, 5 SNPs (2 for IRF6: rs2235375, rs846810; 2 for GREM1: rs2280738, rs1258763; and 1 for TP63: rs9332461) were selected for genotyping.The common criterion for selecting the SNP was a minor allele frequency (MAF) >5% from the 1000 Genomes database of the Asian population [25].Individually, rs2235375 is located in intron 6 of IRF6 and has been discovered to have a strong correlation with rs2235371 in exon 7, which alters the conserved amino acid valine to isoleucine at codon 274 (p.V274I) in the SMIR-binding domain of IRF6 [26].Rs2235371 (the p.V274I polymorphism) is the most significant SNP of IRF6 associated with NSOFC and has been reported to have statistical associations in various populations [9].In addition, rs9332461 of TP63 and rs2280738 and rs1258763 of GREM1 were selected for genotyping based on previous GWAS and association studies [5,27].Besides these known associated SNPs, we chose rs846810 of IRF6, which has not previously been announced as a causative variant of NSOFC in GWAS and replication studies.IRF6 rs846810 is located in intron 5 of IRF6 and is considered a potential marker for the linkage disequilibrium (LD) test in IRF6.Furthermore, rs846810 plays roles in transcriptional regulation by changing the binding of transcriptional factors generated by HaploReg v4.2 (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php(accessed on 6 July 2023)).
The QIAamp DNA Blood Mini Kit (QIAGEN) was used to extract genomic DNA from dried blood spots according to the manufacturer's instructions.The purity of the DNA was confirmed by spectrophotometric tests.All samples were successfully genotyped, with a genotype call rate of >98%, using the TaqMan TM SNP Genotyping Assay (Applied Biosystems, Foster City, CA, USA) in the 7900 HT Fast Real-Time PCR system (Applied Biosystems, Foster City, CA, USA), according to the manufacturer's guidelines.

Statistical Analysis
We tested each SNP for an association with NSOFC subtypes.Adherence to Hardy-Weinberg equilibrium (HWE) was assessed for all SNPs using the healthy control group.In the case-control study, the allelic (Allele), genotypic (Geno), dominant (Dom), recessive (Rec) and over-dominant (Over) models of each marker among samples were compared with Chi-square tests and 95% confidence intervals (CIs) for the odds ratios (ORs).In the dominant model, the major allele homozygote effect was compared with the combining effect of heterozygotes and minor allele homozygotes.In the recessive model, the combining effect of the major allele homozygotes and heterozygotes was compared with minor allele homozygotes.The over-dominant model showed a difference in the heterozygote effect compared with the combining effect of the major and minor allele homozygotes.We also implemented the transmission disequilibrium test (TDT), the parent-of-origin effects (POO) and SNP × SNP epistasis for a case-control population-based sample analysis in PLINK (v1.07; http://pngu.mgh.harvard.edu/purcell/plink/(accessed on 16 March 2023)), which generated an asymptotic p-value.The LD and haplotype with each phenotype in the case-control and case-parent analysis were computed by using Haploview 4.2 (http://www.broad.mit.edu/mpg/haploview/(accessed on 16 March 2023)).Haplotypes with a frequency < 0.03 in these cleft trios were considered rare and thus excluded from statistical analysis.
Bonferroni corrections for multiple SNPs were applied.We used Bonferroni's correction for 25 tests (5 SNPs × 5 phenotypic groups) and 30 tests (6 haplotypes × 5 phenotypic groups) to determine the critical thresholds in formal significance for the allelic association study, SNP interactions, TDT and POO tests (p = 0.002) and haplotype analysis (p = 0.0017), respectively.

Single-Marker Association Analysis
The results of the HWE test and MAF are listed in Table 2, with none of the tested polymorphisms showing a significant deviation from HWE in healthy individuals (p > 0.05).The MAF for analyzed nucleotide variants was at least 6%, similar to the MAFs reported in the KHV population (Kinh in Ho Chi Minh City, Vietnam) from the 1000 Genomes database [25].

Case-Control Comparison and Haplotype Analysis
In the IRF6 gene, rs2235375 showed a significant association with NSCLP (p = 0.0018) and NSCL/P (p = 0.0003) in the allelic model.A low p-value in the genotypic model (p Geno ) was found for rs2235375 in NSCL/P (p = 0.0014), which appeared within a significance level of p ≤ 0.002 after the Bonferroni correction (Table 3).Under the assumption of a recessive inheritance, the calculated p-value of the recessive genetic model (p Rec ) for rs2235375 in NSCL/P was 0.0011.IRF6 rs846810 showed a significant difference in allele frequency between NSCL/P and the control group, with a p-value of 0.0015 and OR of 1.41 with 95% CI 1.14-1.74.None of these SNPs showed a significant association with NSCPO or NSCLO (Table 3).
Regarding the TP63 gene, we did not detect a significant association between NSOFC phenotypes and rs9332461 within the TP63 promoter region (Table 3).
We observed two pathogenic variants in GREM1 s exon and the GREM1-FMN1 intergenic region: rs2280738 and rs1258763.However, neither of these variants showed an association with the Vietnamese cleft sample in this study (Table 3).Additionally, these two SNPs exhibited a moderate LD with each other, with D' ≥ 0.59 among NSOFC phenotypes (Figure 1).Regarding the TP63 gene, we did not detect a significant association between NSOFC phenotypes and rs9332461 within the TP63 promoter region (Table 3).
We observed two pathogenic variants in GREM1′s exon and the GREM1-FMN1 intergenic region: rs2280738 and rs1258763.However, neither of these variants showed an association with the Vietnamese cleft sample in this study (Table 3).Additionally, these two SNPs exhibited a moderate LD with each other, with D' ≥ 0.59 among NSOFC phenotypes (Figure 1).The LD plot in Figure 1 indicates that the alleles at rs2235375 and rs846810 are strongly correlated.The D values (≥0.88) for the pairwise LD comparisons between alleles at these SNPs are close to 1.0 (for strong LD).Because they are in strong LD, Haploview groups these SNPs into a common haplotype block.We analyzed the haplotypes based on these two SNPs of IRF6 and two SNPs of GREM1 to confirm the two-marker haplotypes associated with the risk of NSOFC phenotypes (Table 4).Statistical significance was mainly observed for haplotypes comprising the minor allele (G) of rs2235375 and the G allele of rs846810.The G-G (rs2235375-rs846810 of IRF6) haplotype was associated with the etiology of NSCL/P and NSOFC (p-values of 0.0003 and 0.0012, respectively).Moreover, the C-A (rs2235375-rs846810 of IRF6) haplotype was significant with NSCL/P (p = 0.0011) in the case-control analysis.For the GREM1 gene, there was no statistical association detected with the NSOFC groups in the two-marker haplotypes (C-C, G-C and C-T) for rs2280738 and rs1258763.

Family-Based Association Study and Haplotype Analysis
Table 5 summarizes the results of the TDT analysis of the family-based association of IRF6, TP63 and GREM1 with NSOFC phenotypes in trios in the Vietnamese cohort.Among all five SNPs, only rs2235375 showed a significant association between NSCL/P and the G allele as the over-transmission allele, with a p-value of 0.002 and OR of 1.34 with 95% CI 1.11-1.63after Bonferroni correction.
Parent-of-origin effects were also examined by separating paternal and maternal alleles.Despite being confirmed by TDT analysis, rs2235375 showed no evidence of excess parental transmission in the POO analysis (Table 5).
When conducting haplotype-based transmission disequilibrium analyses with Haploview, we failed to identify the parental over-transmission of haplotypes in IRF6 and GREM1 after p-value adjustment (Table 4).

SNP × SNP Epistasis
We performed pair-wise SNP × SNP epitasis between different genes to analyze numerous suggested potential interactions.However, none of the pairwise SNP × SNP interactions reached the adjusted significance (Table 6).

Discussion
An understanding of the implication of risk factors for NSOFC in diverse racial populations seems to provide more insights into the fundamental etiology.GWAS have successfully identified several related genes and loci that contribute to the high genetic heterogeneity underlying this malformation.However, it is limited in generalizability to other populations and tends to ignore susceptible genes/loci for specific NSOFC subtypes.Further studies with case-control and case-parent trio designs should be useful to unveil the causative factors of NSOFC, and it is vital to explore the differences in diverse populations regarding the same genes/markers.Besides this, linkage studies have provided new tools to detect various possible loci that could have a causal role in cleft lip and palate pathogenesis.The case-control design is the most popular approach to gene mapping in complex traits for a specific population.However, it has been shown that the weakness of this design may generate false associations as an effect of population stratification.Familybased trio design is an adequate method to address the effect of population stratification.This approach allows us to detect preferentially transmitted alleles based on parental and proband genotypes [28].In genetic association studies, a statistically significant association is observed if the p-value falls below a preset threshold to reject the null hypothesis of a genetic association.Analyzing a substantial number of SNP markers results in numerous comparisons, thereby increasing the false positive rate.To avoid this issue, Bonferroni adjustment was applied to control the occurrence of false positives (type I errors).At the significance level of p < 0.002, the power of the association study of the significant markers in IRF6 was more than 80%, which was calculated using the Genetic Association Study (GAS) Power Calculator from the CaTS Power Calculator [29].Additionally, determining an appropriate sample size is crucial in the initial phase of designing a genetic association study to ensure sufficient statistical power [30].
Genetic research on oral clefts in the Vietnamese population has been limited.However, there have been some notable studies and reports that have shed light on potential genetic factors contributing to NSOFC, including the transmission distortion for alleles of MSX1 [31]; the V274I polymorphism (rs2235371) in IRF6 [9]; rs9429829 of SYT14, 17 SNPs in the 10q25.3region and the rs227731 variant from 17q22 [32]; rs2237493 of MEOX2 [33]; and rs1675414 of TFAP2A in NSCLO, as previously reported [20].
Variants in IRF6 are associated with multiple phenotypes of OFC.For example, structural mutations of IRF6 cause VWS or Popliteal Pterygium Syndrome.With the incomplete penetrance of VWS, the isolated OFC, lip pits alone, dental anomalies or even nondiscernible phenotypes could be included in the least severe end of the VWS spectrum.Because of this variation, whether IRF6, one of nine members of a family of transcription factors (IRFs), causes NSOFC remains controversial.Nevertheless, many studies have shown common alleles in IRF6 associated with NSOFC and this has made IRF6 the most frequently studied gene related to NSOFC.This has been independently confirmed in GWAS [6] and candidate gene studies [9], while animal models have revealed that Irf6 is involved in craniofacial morphogenesis and ectodermal formation [34].Among these proven variants, the rs2235371 (V274I) polymorphism emerges as the most plausible marker for NSOFC.However, as Park et al. pointed out, "Significant results observed from SNPs other than rs2235371 (p.V274I) suggest that rs2235371 itself is not causal, but rather in LD with some causal mutation in IRF6" [35].rs2235375, located in intron 6 of IRF6, has been reported to have significant LD with the V274I locus [26].In addition, rs2235375 has shown evidence of an association, with an increase in DNA methylation and a decrease in expression in cerebellar tissues [36].In our current study, with a Vietnamese case-control sample, rs2235375 showed a significant association with the NSCLP and NSCL/P groups, with p allele values of 0.0018 and 0.0003, respectively, indicating an increase in the G allele as a risk for NSCLP and NSCL/P.More specifically, with a p Geno of 0.0014, an association with a p-value of 0.0003 with an OR value of 1.94 (95% CI 1.35-2.80)(not shown in the table) was seen with the increase in homozygous genotype GG compared with CC in the NSCL/P phenotype.In the recessive models, NSCL/P was associated with a protective effect of rs2235375 by decreasing the number of GG genotypes (p rec = 0.0011).On the one hand, these results are consistent with various previous reports in different populations, such as Italian [37], Chinese [38], Chilean [39] and Norwegian [26].On the other hand, rs2235375 remained non-associated with the Mexican cleft population in the Velázquez-Aragón et al. study [40].This difference could be attributed to the diversity of the racial populations.The TDT results in the triad analysis are worth mentioning, with the significant over-transmission of the G allele with NSCL/P (p = 0.0020; OR = 1.34), suggesting that offspring who inherited the G allele at rs2235375 had a 1.34-fold increased risk of NSCL/P compared to the C allele holders.The same results were seen in Mexican [41], European-American [37], Taiwanese, Singaporean, Korean and Western Chinese case-parent triads in a genome-wide TDT anal-ysis [35].However, we did not detect significant distortion in the allele transmission of rs2235375 from parents to affected progeny in any cleft phenotypes using the POO analysis.These results arise from the different approaches between the two family-based methods.TDT avoids false positive associations by testing the difference between the frequency of marker alleles transmitted from heterozygous parents to the affected children and the frequency of marker alleles not transmitted.In comparison, POO looks separately for further possible roles between paternal and maternal genetic effects.
The most noteworthy aspect of this study regarding IRF6 is the novel SNP, which has never been reported in both previous GWAS and replicate studies, namely rs846810.rs846810 showed a strong association with the NSCL/P group (p = 0.0015).An increase in the frequency of the G allele at this site could be associated with an elevated risk of NSCL/P.rs846810, located in intron 5 of IRF6, was in LD with rs2235375 (D' ≥ 0.88, r 2 ≥ 0.26) throughout cleft groups.In the haplotype analysis, the increase in G-G and/or decrease in C-A haplotypes in IRF6 contributed to the etiology of NSCL/P.Although IRF6 rs846810 was located on 1q32.2 and had no direct influence on the structure of the protein, we used HaploReg to generate a motif analysis.HaploReg is a tool for exploring the non-coding variants and their haplotype blocks, and the effects of SNPs on regulatory motifs.HaploReg indicated that rs846810 could change the transcription factor binding site (Foxp1, rf_disc3, Irf_known9, etc.).Thus, this marker might play indirect roles in the pathogenesis of NSOFC by modulating the corresponding transcription factors of IRF6.
In both the case-parent and haplotype analyses of NSOFC triads, no significant association was detected for rs846810 and IRF6 haplotypes.However, the G-G haplotype of IRF6 was nominally significant with NSCL/P (p = 0.0046) and is worthy of consideration for another independent replication study in the future.This is the first time that these haplotypes in IRF6 have been reported and rs846810 has been addressed as a causal genetic variant of NSCL/P.Therefore, rs846810 should be considered for investigation in various communities for the comparison of racial genetic diversity.
Variants in TP63 have been reported as an etiology for variable features, including, but not limited to, ectrodactyly-ectodermal dysplasia cleft syndrome, Rapp-Hodgkin syndrome, split-hand/foot malformation, limb-mammary syndrome and ADULT syndrome [42].TP63 gene transcription can start from two distinct promoters, resulting in two isoforms: the TAp63 isoform and the ∆Np63 isoform.The TP63 gene also contains versatile regulatory elements in its large intronic regions.The linkage between IRF6 and TP63 has been proven in numerous previous studies, especially the ∆Np63 isoform.IRF6 was discovered to promote ∆Np63 protein degradation in both mouse palatal shelves and human keratinocytes, expressed in all embryonic stages during epidermal, tooth and hair development.In contrast, the TAp63 isoforms are not detected until the late embryonic stage [43].The distinct contribution between the two isoforms to cleft etiopathogenesis was partly confirmed by various SNPs in the intron 1 region of TP63, which was proven to have no significant association with NSCLP [14,27].rs9332461 was chosen from the upstream region of TP63, which contains the promoter region and various regulatory elements of TP63.HaploReg suggests that TP63 rs9332461 could change the transcription factor Nkx2_1 in brain, lung and thyroid development [44].Moreover, research in multiplex populations has shown that rs9332461 may confer an increased risk for NSCLP [27].However, it is worth noting that the significance and association of rs9332461 can vary among different ethnicities.Therefore, additional studies in diverse populations are required to validate and expand our understanding of the role of this SNP in cleft diseases.
GREM1, a downstream target of BMP signaling, is required for limb bud development, where it validates a positive SHH-FGF feedback loop through the restriction of BMP signaling.GREM1 also induces BMP-independent phenotypes, including the induction of proliferation and apoptosis [45] and the control of monocyte migration, adhesion and apoptosis via the inhibition of macrophage migration inhibitory factor release [46].Additionally, the contribution of GREM1 is indisputable for the embryonic process (kidney, lung and bone development).Ludwig et al. demonstrated the GREM1 expression in the fusion region of the maxillary and medial nasal processes during mouse embryogenesis at E12.5 (embryonic day 12.5), and between E12.5 and E15. 5 [47].For the 15q13 locus, which contained rs2280738 and rs1258763 in our study, evidence for long-distance regulatory effects on Grem1 has been provided by studies in other mammals, and the Fmn1 gene was found necessary for the cis-regulation of Grem1 transcription [48].
rs2280738 was chosen from the exonic region and rs1258763 was represented in the GREM1-FMN1 intergenic region that could be deputized for GREM1.Neither rs2280738 nor rs1258763 showed significant differences in the distribution of alleles, genotypes, haplotype association and TDT and POO analysis among NSOFC phenotypes with our correction for multiple comparisons.Few genetic studies have been carried out regarding the impact of GREM1 variants on the risk of NSOFC, and the results vary both in GWAS [5,47] and among different populations through association studies [32,49,50].rs2280738 was presented as a rare GREM1 mutation for the pedigrees of families in Ludwig et al.'s report [47] and this was replicated with the Chinese population [51].In contrast to our results, Han Chinese citizens were positively associated with rs2280738 in the NSCLO group [51].Our findings support the argument that GREM1 has a low number of etiologic exonic variants in cleft patients [49,50].Regarding rs1258763, various previous reports hold opposite opinions on this variant's contribution to NSOFC in the European, Chinese and Brazilian populations.Ludwig et al. suggested that rs1258763 was significantly associated with an increased risk for NSCLP [47], while Mostowska et al. asserted that the minor allele (G) of rs1258763 was supported in decreasing the risk of NSOFC among the Polish [50].Many subsequent studies have agreed with the relation between rs1258763 and cleft etiology in the Chinese [52] and Brazilian populations [53].Furthermore, the rs1258763 minor allele is significantly associated with nasal width reduction and this association was found to be stronger in males [54].Our negative association results suggest that its conflicting effects may result from genetic heterogeneity in different ethnicities, even in the Asian race, where the frequency of susceptibility variants may differ across populations.Of course, there are other possibilities, such as the fact that the differences may be due to the low statistical power, including our limited sample sizes and differences in the experimental design and methods.Besides these, our results indicate that the GREM1-FMN1 intergenic region might not contain causal variants, at least in the Vietnamese population.The question of whether the 15q13.3locus, where rs1258763 is located, plays a role in the genetic susceptibility to NSOFC remains open and requires further research.
In the present work, we also found moderate LD between GREM1 rs2280738 and rs1258763, with 0.54 ≤ D' ≤ 0.63 but a very low r 2 .The low r 2 was derived from the rarer allele frequency of rs1258763 compared with rs2280738 in the Vietnamese population.
Haplotype-based association studies offer advantages over single-marker analysis by providing increased statistical power and the ability to detect rare variants that might not be well captured by single-marker analyses due to low allele frequencies, elucidating biological relevance [55], identifying interactions between SNPs at a locus, reducing the multiple testing burden and accounting for population structures more effectively [56].However, they may introduce complexity and require larger sample sizes.The choice between the two methods depends on the research question and the genetic architecture of the trait.However, combining both approaches can yield a more comprehensive understanding of genetic associations.
OFC is a complex congenital anomaly exhibiting an interaction between genetic and environmental factors [57].Although cleft lip and cleft palate often occur together, NSCL/P and NSCPO are two distinct groups of NSOFC based on embryological and epidemiological distinction [58].Several GWAS and meta-analyses have shown that while common variants strongly contribute to NSCL/P, they do not seem to affect NSCPO [6,59].NSCPO might be more often caused by rare deleterious variants and more vulnerable to environmental factors.Many studies also suggest that NSCLO and NSCLP might have separate genetic pathways [60].The IRF6 susceptibility in our study with NSCL/P and its subtypes (NSCLO and NSCLP) also confirmed the different genetic etiologies between NSCL/P and NSCPO.
Several limitations of our study need to be pointed out, such as the small size of the sample groups, the fact that it was not a multiracial study, the insufficient numbers of selected variant loci and the lack of analysis of the biological functions of the polymorphisms and gene-environmental factors.However, our study will open promising future research in Vietnamese cleft disorders, particularly regarding IRF6 and its related genes.This includes exploring the PBx-WNT-TP63-IRF6 pathway, studying the effect of IRF6 on various genes and investigating interactions within the SHH-GREM1-BMP4-FGF network.

Conclusions
In conclusion, we performed case-control and family-based evaluations combined with haplotype analysis on 527 Vietnamese trios among NSOFC phenotypes.This study investigated the role of IRF6, TP63 and GREM1 variants in the etiology of NSOFC and its phenotypes.Our study provided additional evidence for the association between IRF6 rs2235375 with NSCLP and NSCL/P and showed the significant over-transmission of the G allele with NSCL/P.We also identified the novel locus of IRF6 (rs846810) in association with NSCL/P and the rs2235375-rs846810 haplotypes (G-G and C-A) associated with NSCL/P and NSOFC.Our data did not support the direct involvement of TP63 and GREM1 in orofacial cleft diseases in the Vietnamese population.

Figure 1 .
Figure 1.Linkage disequilibrium (LD) analysis in each subtype of NSOFC; the values within each diamond represent the pairwise correlation (D' and r 2 ) between 2 variants of IRF6 in chromosome 1 and 2 SNPs of GREM1 in chromosome 15.

Table 1 .
Characteristics of the study sample.

Table 2 .
Minor allele frequency for each SNP and Hardy-Weinberg equilibrium test.

Table 4 .
The haplotype association analysis in case-control and case-parent samples.
Abbreviations: CC, case-control; TDT, transmission disequilibrium test; Freq., frequency; T/U, transmitted/not transmitted.In bold are p-values that were significant after adjustment with Bonferroni correction in multiple tests (p ≤ 0.0017).

Table 5 .
The transmission disequilibrium test (TDT) and the parent-of-origin (POO) likelihood ratio test of inequality between paternal and maternal.
Abbreviations: TDT, transmission disequilibrium test; POO, parent of origin; Pat, paternal; Mat, maternal.A1/A2, 2 alleles for each SNP; T/U, transmitted/not transmitted.In bold are p-values that were significant after adjustment with Bonferroni correction in multiple tests (p ≤ 0.002).

Table 6 .
The SNP × SNP epitasis for case-control population-based sample.